Journal: bioRxiv
Article Title: Coralysis enables sensitive identification of imbalanced cell types and states in single-cell data via multi-level integration
doi: 10.1101/2025.02.07.637023
Figure Legend Snippet: A Unintegrated pancreatic reference UMAP highlighting the batch (sequencing library) and cell type labels (left to right). B Integrated pancreatic reference UMAP obtained with Coralysis, highlighting the respective batch and cell type labels (left to right). C Queries projected onto the integrated reference UMAP with the Coralysis reference-mapping method. Colours highlight dataset identity, i.e., reference and query (indrop2 and smartseq2) labels (left), and ground-truth cell type labels (right). D Queries projected onto the reference UMAP, with reference cells removed for clarity. Colours highlight query identity and predicted cell labels (left to right). E, G UMAPs highlighting ground truth, predictions and confidence scores for the E indrop2 and G smartseq2 queries projected onto the reference UMAP. F, H Confusion matrices of predicted versus ground-truth cell labels for the F indrop2 and H smartseq2 query predictions. The top colour bar represents cell type labels. The confidence scores represent the proportion of the K neighbours that belong to the winning class (K = 10). The values in the confusion matrices correspond to the number of cells for each combination of predicted and ground-truth label, and the values at the end of each row give the total number of misclassifications for each predicted cell type. The heatmap colours in the matrices represent the frequency of predicted cell types as a percentage.
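The confidence score and the row totals described in the legend can be illustrated with a short sketch. This is not the Coralysis implementation (Coralysis is an R package). It is a minimal Python illustration that assumes Euclidean distance in some embedding space, a simple majority vote among the K neighbours, and confusion-matrix rows indexed by predicted label. The function names `knn_predict` and `confusion_counts` are hypothetical.

```python
from collections import Counter
import math


def knn_predict(query, ref_points, ref_labels, k=10):
    """Return (predicted_label, confidence) for a single query point.

    Confidence is the proportion of the k nearest reference cells that
    carry the winning label. Euclidean distance is an assumption here.
    """
    # Rank reference cells by distance to the query and keep the k closest.
    order = sorted(range(len(ref_points)),
                   key=lambda i: math.dist(query, ref_points[i]))
    neighbours = order[:k]
    # Majority vote among the neighbours' labels.
    votes = Counter(ref_labels[i] for i in neighbours)
    label, count = votes.most_common(1)[0]
    return label, count / len(neighbours)


def confusion_counts(predicted, truth):
    """Return (matrix, misclassified) with rows indexed by predicted label.

    matrix[p][t] is the number of cells predicted as p whose ground-truth
    label is t; misclassified[p] is the off-diagonal total of row p.
    """
    counts = Counter(zip(predicted, truth))
    labels = sorted(set(predicted) | set(truth))
    matrix = {p: {t: counts[(p, t)] for t in labels} for p in labels}
    misclassified = {
        p: sum(n for t, n in matrix[p].items() if t != p) for p in labels
    }
    return matrix, misclassified
```

Under these assumptions, with K = 10 a query whose ten nearest reference cells include seven of one cell type would receive that label with a confidence of 0.7.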
Article Snippet: A set of pancreatic scRNA-seq datasets comprising eight samples (celseq, celseq2, smartseq2, fluidigmc1, indrop1, indrop2, indrop3, indrop4) sequenced across five library preparation technologies (SMARTSeq2, Fluidigm C1, CelSeq, CelSeq2, inDrops) was used to assess the performance of reference-mapping with and without shared batch effects.
Techniques: Sequencing